Recursive Autoencoders for ITG-Based Translation

Authors

  • Peng Li
  • Yang Liu
  • Maosong Sun
Abstract

While inversion transduction grammar (ITG) is well suited to modeling ordering shifts between languages, making the choice between its two reordering rules (straight and inverted) depend on the actual blocks being merged remains a challenge. Unlike previous work that uses only boundary words, we propose using recursive autoencoders to exploit the entire merging blocks instead. The recursive autoencoders can generate vector space representations for variable-sized phrases, which allows order prediction to exploit syntactic and semantic information from a neural language modeling perspective. Experiments on the NIST 2008 dataset show that our system significantly improves over the MaxEnt classifier by 1.07 BLEU points.
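The idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It shows only the encoder half of a recursive autoencoder (the reconstruction step used for training is omitted) folding phrases of different lengths into fixed-size vectors, followed by a softmax over the two ITG orders. The dimension `DIM`, the random untrained weights, the left-branching composition order, and all function names are assumptions made for illustration.

```python
import math
import random

DIM = 4  # hypothetical embedding size, chosen only for this sketch


def rand_matrix(rows, cols, rng):
    """Small random weights; a real system would learn these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def affine(W, b, x):
    """Compute W @ x + b with plain lists."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def compose(W, b, left, right):
    """Encode two child vectors into one parent vector of the same size."""
    return [math.tanh(v) for v in affine(W, b, left + right)]


def encode_phrase(W, b, word_vecs):
    """Fold a variable-length phrase into one vector.

    Assumption: a left-branching binary tree; other tree shapes are possible.
    """
    vec = word_vecs[0]
    for nxt in word_vecs[1:]:
        vec = compose(W, b, vec, nxt)
    return vec


def order_probs(Wo, bo, block1, block2):
    """Softmax over (straight, inverted) computed from both block vectors."""
    scores = affine(Wo, bo, block1 + block2)
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
W, b = rand_matrix(DIM, 2 * DIM, rng), [0.0] * DIM
Wo, bo = rand_matrix(2, 2 * DIM, rng), [0.0, 0.0]


def embed(n_words):
    """Stand-in word vectors; a real system would look these up."""
    return [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(n_words)]


block_a = encode_phrase(W, b, embed(3))  # a 3-word block
block_b = encode_phrase(W, b, embed(5))  # a 5-word block
p_straight, p_inverted = order_probs(Wo, bo, block_a, block_b)
```

Both blocks end up as vectors of length `DIM` regardless of how many words they contain, so the order classifier sees the whole of each block rather than only its boundary words.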

Similar resources

An Infinite Hierarchical Bayesian Model of Phrasal Translation

Modern phrase-based machine translation systems make extensive use of word-based translation models for inducing alignments from parallel corpora. This is problematic, as the systems are incapable of accurately modelling many translation phenomena that do not decompose into word-for-word translation. This paper presents a novel method for inducing phrase-based translation units directly from par...

Utterance Intent Classification of a Spoken Dialogue System with Efficiently Untied Recursive Autoencoders

Recursive autoencoders (RAEs) for compositionality of a vector space model were applied to utterance intent classification of a smartphone-based Japanese-language spoken dialogue system. Though the RAEs express a nonlinear operation on the vectors of child nodes, the operation is considered to be different intrinsically depending on types of child nodes. To relax the difference, a data-driven u...

Robust Example-based Dialog Retrieval using Distributed Word Representations and Recursive Autoencoders

This work builds on our previous example-based chat-oriented dialog systems, which utilize human-to-human conversation. Though promising, our previous simple retrieval techniques were weak at handling out-of-vocabulary (OOV) database queries. In this paper we discuss an approach to increase the robustness of example-based dialog response retrieval. We employ a recursive neural networ...

Driving inversion transduction grammar induction with semantic evaluation

We describe a new technique for improving statistical machine translation training by adopting scores from a recent crosslingual semantic frame based evaluation metric, XMEANT, as outside probabilities in expectation-maximization based ITG (inversion transduction grammars) alignment. Our new approach strongly biases early-stage SMT learning towards semantically valid alignments. Unlike previous...

Improving word alignment for low resource languages using English monolingual SRL

We introduce a new statistical machine translation approach specifically geared to learning translation from low resource languages, that exploits monolingual English semantic parsing to bias inversion transduction grammar (ITG) induction. We show that in contrast to conventional statistical machine translation (SMT) training methods, which rely heavily on phrase memorization, our approach focu...

Publication date: 2013